ABE

Section: User Commands (1)

NAME

abe - Ascii-Binary Encoder

SYNOPSIS

abe [ options ] [filename ...]

DESCRIPTION

The abe program encodes binary files into a bullet-proof form consisting only of printable ASCII characters. This new form can be sent through communications channels which might get upset at non-printable characters, such as USENET news, mail and various text file downloading programs. ABE files should be able to pass through a lot of mechanisms and operating systems that will kill lesser files.

Abe is a replacement for the uuencode(1) program. The encodings produced by abe are usually smaller, more compressible, more readable and far more bullet-proof than those produced by uuencode.

All lines in an ABE encoding have a three-character line number as well as a checksum. That means that ABE lines may be broken apart, scrambled in a random order, and even have garbage lines inserted among them without damage. The sort(1) program (or any other text file sort utility) can always restore an ABE file to its proper state.
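
The recovery property described above can be illustrated with a minimal Python sketch. It does not use the real ABE line format: the 3-digit decimal line number, the single trailing check digit and the names make_line, is_valid and recover are all assumptions made for illustration only. The real format uses its own 3 printable characters for the line number and its own checksum.

```python
import random

def check_char(text):
    # Hypothetical checksum: sum of character codes modulo 10, as one digit.
    return str(sum(map(ord, text)) % 10)

def make_line(number, payload):
    # Hypothetical layout: 3-digit line number, payload, 1 check character.
    body = f"{number:03d}{payload}"
    return body + check_char(body)

def is_valid(line):
    # Keep a line only if it starts with a line number and its trailing
    # check character matches the rest of the line.
    return (len(line) >= 4 and line[:3].isdigit()
            and check_char(line[:-1]) == line[-1])

def recover(lines):
    # Drop garbage lines, then sort. The fixed-width number prefix means a
    # plain text sort restores the original order.
    return sorted(line for line in lines if is_valid(line))

payloads = ["alpha", "bravo", "charlie", "delta"]
original = [make_line(i, p) for i, p in enumerate(payloads, start=1)]

# Simulate a hostile channel: insert garbage lines and scramble the order.
damaged = original + ["From: someone@example.com", "-- signature line --"]
random.shuffle(damaged)

recovered = recover(damaged)
```

Because the line number is a fixed-width prefix, no knowledge of the payload is needed to put the lines back in order, which is why an ordinary sort utility is enough.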

ABE files can be split into "blocks" when the transport mechanism being used is unable to transfer files longer than a given length. These blocks contain checksums, length information and `seek address' information for independent verification. With the full dabe ABE decoder, it is possible to still decode a file with missing blocks. Empty regions will simply be left undefined in the resulting file. If redundant decoding information is added to the blocks, they can be presented to the decoder in any order, without sorting, and blocks may even be duplicated. All this was designed with the typical problems of USENET binary distribution in mind.
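
As a rough illustration of decoding with missing, duplicated or out-of-order blocks, the following Python sketch places each block at its seek address. The (address, data) block representation and the function name reassemble are assumptions, not the dabe implementation. The sketch zero-fills missing regions purely so the result is printable; the real decoder leaves them undefined.

```python
def reassemble(blocks, total_length):
    # blocks: iterable of (seek_address, data) pairs, in any order.
    output = bytearray(total_length)    # zero-filled placeholder (assumption)
    present = bytearray(total_length)   # 1 where a byte has been written
    for address, data in blocks:
        end = address + len(data)
        if address < 0 or end > total_length:
            raise ValueError("block lies outside the file")
        output[address:end] = data
        present[address:end] = b"\x01" * len(data)

    # Collect the regions no block covered, as (start, end) pairs.
    gaps, start = [], None
    for i, flag in enumerate(present):
        if not flag and start is None:
            start = i
        elif flag and start is not None:
            gaps.append((start, i))
            start = None
    if start is not None:
        gaps.append((start, total_length))
    return bytes(output), gaps

# The middle block is lost, the last one arrives first and is duplicated.
received = [(8, b"IJKL"), (0, b"ABCD"), (8, b"IJKL")]
result, gaps = reassemble(received, 12)
```

Since every block carries its own address, the order of arrival does not matter and a duplicate simply rewrites the same bytes.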

Two decoders exist. One is the `tiny' decoder, tinydabe.c. This is a 100-line, public domain, portable C program which can be included with ABE files. Thus any person with a C compiler can decode an ABE file, even if they have never heard of ABE files before. It is limited to single-file encodings of less than 2 megabytes in size.

With the full ABE decoder, dabe, more advanced decoding, with more error checking, is possible. It is suggested that the tiny decoder only be used by first-time users of the format, and that those who plan more work should endeavour to use the complete decoder.

OPTIONS

(Note that while option names are displayed here in full, only the first letter is actually required. For +/- options, using + turns the option on, and using - turns the option off.)

blocksize=num: Request that files be split into blocks with an approximate size of num. Note that blocks will actually be a little bit larger than the requested size, so choose a number lower than your hard maximum. Blocks will be put into the single output file unless an output file prefix name is provided (p=name).
prefix=str: Normally, abe writes encodings to the standard output. This option turns on file blocking, and arranges for each block to go into a different file. All file names will start with the prefix str and will have a 2-digit hexadecimal number at the end. The default block size is 40,000 characters, but that may be changed with the blocksize option (b=num).
prefix=|command: On UNIX systems, if the prefix string begins with an or-bar (|), the blocks will actually be piped through a shell process using popen(3). The shell command string passed to popen will be that generated by sprintf(3) with the prefix string (excluding the or-bar) given as the format string, and the file number given as an integer argument. For example, on Unix:
    abe b=25000 file "p=|mail -s 'Part%d' fbaggins"
would mail all the blocks, with titles, to user fbaggins. Note that you must quote the whole option, or the or-bar will be taken as a pipe character by the Unix shell.
universalname=name: ABE encodings include both the real name of the encoded file and a special universal name that is limited to 12 characters and should contain no directory characters like slash. The universal name is used when decoding on an operating system different from the encoder's system. Universal names are also used when multiple files are placed in the same encoding. If you don't provide a universal name, one will be formed from the real file name. You can only provide your own universal name when encoding a single file. If no filename is given, a universal name of "stdin" is used.
decoder=pathname: Insert the source to the tiny ABE decoder "tinydabe.c" from the file in pathname.
sample=size: abe can do either a single pass or a double pass over its input, except when the input is the standard input, in which case only a single pass is possible. abe likes to do two passes so that it can get frequency tables for the bytes in the input file. The more accurate the frequency tables, the smaller the encoding. If two passes are not possible, or you request one-pass operation with this option, abe reads in a buffer of size size and builds the frequency table from that. The default (for stdin) is 10,000 bytes. You can set it as high as the limit for dynamic memory allocation on your system.
linenumber=num: Normally ABE encodings start at line one. If you wish to concatenate two encodings, you can start your second encoding at a higher line number with this option. You give the number in decimal, although it will be output in ABE's special format of 3 printable characters. Encodings that don't start at line 1 will be rejected by the tiny dabe decoder.
+redundant: In a blocked encoding, this option asks that redundant information be added to each block, so that the file may be decoded without sorting by the advanced ABE decoder, even if blocks are missing, duplicated or in the wrong order.
+decoder: Request that the source for the tiny ABE decoder be inserted into your ABE encoding. The source is to be taken from the standard location defined by your system administrator.
+ebcdic: Use the ABE2 encoding, which is designed to pass through EBCDIC machines without trouble. The ABE2 encoding does not make use of the following characters: "![\]^`{|}~" -- they have all been reported to sometimes not survive multiple ASCII<-->EBCDIC translations. It maps to 4 sets of 64 characters, and produces encodings that are slightly larger.
+uuencode: Use the UUENCODE encoding scheme. This scheme is totally unlike the ABE schemes, but is quite popular and sometimes a bit smaller on compressed binary files. UUENCODE lines are formed from the 64 characters from space to underbar, with a simple mapping that maps 4 printable characters to 3 binary bytes. Files produced with +uuencode can often be decoded by uudecode(1) decoders after the application of sort and a simple sed(1) script to remove the first four bytes of each line. If you use the -numbers option, the sed script is not even necessary. The UUENCODE format is more prone to errors and usually is more bulky than either ABE format.
-numbers: Remove line numbers and line checksums from most of the encoding. The first few lines of every block will still have line numbers, but the bulk of the encoding will not. This saves 4 bytes per line, reducing the size of encodings by about 6%. Such encodings can't be decoded by tinydabe decoders, nor can they be sorted. While they are more prone to potential errors, most such errors occur between blocks, so removing the line numbers is usually safe. When used with +uuencode, this option allows encodings that can be decoded both with dabe(1) and uudecode(1).
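
The two forms of the prefix option can be pictured with a short Python sketch. The function names block_file_name and block_command are hypothetical, and the lower-case hexadecimal digits are an assumption, since the text above only says the suffix is a 2-digit hexadecimal number. Python's % operator stands in here for sprintf(3) with a single integer argument.

```python
def block_file_name(prefix, number):
    # prefix=str form: prefix followed by a 2-digit hexadecimal block number.
    # Lower-case hex digits are an assumption made for this sketch.
    return f"{prefix}{number:02x}"

def block_command(prefix_option, number):
    # prefix=|command form: strip the leading or-bar, then use the rest as a
    # printf-style format string with the block number as its only argument.
    if not prefix_option.startswith("|"):
        raise ValueError("not a pipe prefix")
    return prefix_option[1:] % (number,)

name = block_file_name("part", 10)
command = block_command("|mail -s 'Part%d' fbaggins", 3)
print(name)      # part0a
print(command)   # mail -s 'Part3' fbaggins
```

The resulting command string is what would be handed to popen(3), so each block is mailed with its own numbered subject line.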

OPERATION

Abe can take input in 3 ways. The first is the standard input, which allows abe to be used at the end of a pipe. If the standard input is used, abe works in one-pass mode, and only reads the first part of the file to figure out character mappings.
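
To show what one-pass mode implies, here is a small Python sketch that builds a byte frequency table from only the first part of a stream. The function name sample_frequencies is hypothetical; the 10,000-byte default is the figure given for the sample option, and the tiny sample size in the example is only there to make the effect visible.

```python
import io
from collections import Counter

def sample_frequencies(stream, sample_size=10_000):
    # Read only the first sample_size bytes and count them. The sample is
    # returned too, because an encoder would still have to encode it.
    sample = stream.read(sample_size)
    return Counter(sample), sample

# 8 'a' bytes, 4 'b' bytes, then a long run of 'z' that lies past the sample.
stream = io.BytesIO(b"a" * 8 + b"b" * 4 + b"z" * 100)
table, sample = sample_frequencies(stream, sample_size=12)
```

In this example the table never sees the run of z bytes, even though it dominates the input. That is the cost of a single pass, and the reason a larger sample size tends to give a smaller encoding.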

Abe may also be given a single filename, in which case that file will be encoded. An alternate output name can be provided with the "universalname=" option.

Abe can also be given multiple files. The output will be roughly equivalent to the concatenation of single-file ABE encodings, except the line numbers will continue properly in sequence. This produces a sort of multi-file archive, although abe is not intended to be used as an archiver. In fact, it is better to use abe on the output of general non-compressing archivers like tar(1) or cpio(1). It can also be used on compressed archiver output, but generally it's better to let the transport mechanism (usually USENET links) worry about doing the compressing.

ENCODING FORMAT

In the standard ABE1 encoding, 256 bytes are broken up into 3 sets, with 86, 86 and 84 bytes, respectively. The most common 86 bytes in the file go into set 0, and so on. 86 of the printable ASCII characters are used to encode the members of each set. Special printable escape characters switch from set to set.
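
The split into sets can be sketched in Python as follows. This is not the abe source: the function name partition_bytes and the tie-breaking rule (lower byte value first) are assumptions, and the sketch covers only the ranking by frequency, not the assignment of printable characters or escape characters within each set.

```python
from collections import Counter

SET_SIZES = (86, 86, 84)   # the three ABE1 sets; 86 + 86 + 84 = 256

def partition_bytes(data, set_sizes=SET_SIZES):
    counts = Counter(data)
    # Rank all 256 byte values, most frequent first. Ties are broken by
    # byte value, which is an assumption made for this sketch.
    ranked = sorted(range(256), key=lambda b: (-counts[b], b))
    sets, start = [], 0
    for size in set_sizes:
        sets.append(ranked[start:start + size])
        start += size
    return sets

data = b"hello world " * 10 + bytes(range(256))
sets = partition_bytes(data)
```

The bytes that occur most often land in set 0, so most of the input can be encoded without switching sets, which keeps the number of escape characters low.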

In an ABE encoding, printable characters always map to themselves, if possible. This means that printable character strings found in binary files are still readable in an ABE encoding. You can often look at a raw ABE file and see what it is, which is quite useful. In addition, the byte 0 maps to the ASCII digit "0", and several other similarly useful mappings are made.

ABE files also have header information that defines information about the encoded files, block headings, sizes and checksums. For full details on the encoding format, see the special file on that in the ABE kit.

The ABE2 encoding splits the 256 bytes into 4 sets of 64 bytes each. It avoids certain dangerous characters. Otherwise it is similar to ABE1. ABE2 encodings are only slightly larger, and slightly less readable, than ABE1 encodings.

COMPRESSION

ABE files usually use the same string of printable characters to represent a given string of printable bytes. (This is not true for uuencodings.) This is good for LZW compressors.
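
The remark about uuencodings can be seen with a short Python sketch of the classic 3-bytes-to-4-characters mapping, using the 64 characters from space to underbar. The function name uu_body is hypothetical, and the sketch encodes only the data characters: it omits the per-line length character and the begin and end lines of a real uuencoded file.

```python
def uu_body(data):
    # Pad to a multiple of 3 bytes, then map each 3-byte group to four
    # 6-bit values, each written as the character with code 32 + value.
    padded = data + b"\0" * (-len(data) % 3)
    out = []
    for i in range(0, len(padded), 3):
        b0, b1, b2 = padded[i:i + 3]
        values = (
            b0 >> 2,
            ((b0 & 3) << 4) | (b1 >> 4),
            ((b1 & 15) << 2) | (b2 >> 6),
            b2 & 63,
        )
        out.extend(chr(32 + v) for v in values)
    return "".join(out)

aligned = uu_body(b"Cat")    # "Cat" fills exactly one 3-byte group
shifted = uu_body(b"xCat")   # the same bytes, moved one position along
```

Here aligned is "0V%T", but that string does not appear anywhere in shifted, because the same three bytes now straddle two groups. A dictionary compressor such as LZW therefore sees unrelated strings where the input had a repeated one.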

ABE encodings are very good on text files. In general, except for the overhead of headers, checksums and line numbers, text files encode to the same size in an ABE file. Sadly, ABE does its worst job on compressed files. This "worst job" is usually about the same as the job done by uuencode, plus the overhead of headers, checksums and line numbers. In general, files posted to USENET should not be pre-compressed, as compression should be left to the transportation mechanisms. (Most USENET links batch and compress what they transmit.)

BLOCKING

The ABE blocking system is ideal for sending binaries over USENET and other limited channels. Normally, ABE output is a continuous stream sent to the standard output.

AUTHOR

The ABE system was written by Brad Templeton, who is brad@looking.on.ca. (Mail regarding abe should go to abe@looking.on.ca.) The tiny ABE decoder is released to the public domain. All other files are Copyright 1989 by Brad Templeton. A licence for unlimited non-commercial use of these encoders is granted. See the source code in the ABE kit for full details on the licence.

No fee is requested or required for the use of these programs. If you feel the need to show appreciation, you might order copies of the REC.HUMOR.FUNNY Computer Network Humour Annual(s) (a USENET jokebook) for 9.95 USD+S/H. Mail to jokebook@looking.on.ca or call 519/884-7473. There is no requirement to buy the jokebook in order to use these programs.

FILES

tinydabe.c

VERSION

Version 1.0
